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Finding the optimal assignment in budget-constrained auctions is a combinatorial optimization 
problem with many important applications, a notable example being the sale of advertisement space 
by search engines (in this context the problem is often referred to as the off-line AdWords problem). 
Based on the cavity method of statistical mechanics, we introduce a message passing algorithm 
that is capable of solving efficiently random instances of the problem extracted from a natural 
distribution, and we derive from its properties the phase diagram of the problem. As the control 
parameter (average value of the budgets) is varied, we find two phase transitions delimiting a region 
in which long-range correlations arise. 



I. INTRODUCTION 
A. Problem definition 

We consider the following model of budget-constrained auctions, inspired from the sale of advertisement space on 
internet search engines: a set of advertisers a € {1, • • • , A^a} are interested in appearing on the results pages of the 
searches relative to some keywords fc G {1, ■ • ■ , A^k}- Advertiser a offers to pay a hid Wka S IR^ in order to appear on 
the results page each time that keyword k is searched. We assume that all the bids are expressed before the auction 
begins. In this setting, advertiser a doesn't know how many auctions he will end up winning, and therefore how much 
he will spend. In order to encourage the advertisers to make more bids without risking to spend too much, each 
advertiser a can also specify a budget Ba G R"*" which is the maximum sum that he is willing to pay in a given period 
of time. 

The general problem we want to solve is the following: given the sets of bids {wka} and of budgets {Ba}, what is 
the assignment of each keyword k to some advertiser a which maximizes the total revenue for the search engine? We 
can represent a possible assignment by introducing the binary variable Xka which will take the value 1 if keyword k is 
assigned to advertiser a and otherwise. The constraint that each keyword be assigned to one and only one advertiser 
then takes the form 



E 



Xka = 1 (Vfc) (1) 



while the budget constraint takes the form 



^XkaWka<Ba (Vfl) (2) 



k 

and the quantity we want to maximize is the revenue of the search engine 

^^XkaWka (3) 

a k 

where x = {xka}- In order to maximize the revenue, we shall make it possible to sell keywords at a discounted price 
when this allows to saturate the budget of some advertiser which would otherwise remain unsaturated, and define the 
revenue as 



Ba, XkaWka 
k 



(4) 



Clearly, this problem is a variant of the weighted bipartite matching in which the budget constraint has been added. 
It is also a special case of the general linear Resource Allocation Problem on binary variables, which consists in finding 
{xi} G {0, 1}" that maximizes J2i '^iXi under the constraints ^ bijXj < Ci (j = 1, • • • , m). 



B. Summary of known results 

Both the on-Une and the off-line versions of the AdWords problem have been the object of considerable attention 
in recent years. The off-line version is NP-hard even if there are only 2 advertisers An exact algorithm to solve 
it with time complexity 0(7Va4^'') and an approximate algorithm based on Integer Programming relaxation with an 
approximation ratio^ e/(e — 1) (on the revenue) arc presented in An improvement over this result is provided by 
[3|, where an algorithm with approximation ratio 3/2 is introduced. An approximate algorithm for the on-line version 
of the problem with competitive ratio'^ e/(c — 1) is introduced in 

These results are concerned with the worst-case analysis of the performance (scaling of time and approximation 
ratio) of the algorithms. In what follows, we shall be interested in a different question: is there an algorithm which 
has a good performance on average for some given ensemble of instances of the problem? The ensemble of instances 
will be specified by defining the distributions of the bids and of the budgets. 



II. STATISTICAL MECHANICS ANALYSIS OF THE PROBLEM 



A. Factor graph representation and probability marginals 



We want to consider the problem of budget-constrained auctions as a statistical mechanics system, as done for 
example in The configurations x = {xka} will represent the assignments {xka = 1 if keyword k is assigned to 
advertiser a and otherwise), and the energy function E{x) will represent the portion of the advertisers budgets 
which remains unspent: 



Eix)=J2EaiXa) = J2' 



0, Ba- ^ WhaXka 
k^da 



(5) 



where da = {k\wka > 0}, and Xa = {xka\k G da}. 

We want to include the constraint ^ that each keyword be assigned to one and only one advertiser as a hard 
constraint. We therefore write the Boltzmann-Gibbs distribution at inverse temperature P as: 



1 



Z{(3) 



E 



Xka 



(6) 



where the partition function Z{f3) is a normalization. 

This factorization of the probability distribution into local terms involving only some variables and corresponding 
to the assignment constraints U]) and the energy terms (O suggests a factor graph representation Q in which an 
instance of the problem is associated to a bipartite graph Q = {F, V; E) where F = {a] U {fc} is the set of function 
nodes corresponding to the constraints, V = {xka\ is the set of variable nodes and E is the set of edges {a,Xka) 
and {k,Xka) such that Wka > 0. In the following we shall assume that as A'a, A^k oo, the average number of bids 
expressed by each advertiser remains finite. This kind of diluted systems can be successfully treated with the cavity 
method of statistical mechanics 0, Q , originally applied to optimization in [^, [l3| . 

Given an instance of the problem Q = {F, V; E), let us consider a modified instance C/^'^-' in which the hmction node k 
is removed, and another instance CJ^"-' in which the function node a is removed (the rest of these graphs being identical 
to G). The name cavity refers to the missing node in Q^''^ and in Q^°'\ We shall now make an assumption which goes 
under the name of Replica Symmetric approximation (RS), and which will be validated ex-post: we shall assume that 
variables x^a and Xjb that are far away on the graph are uncorrelated, i.e. that their marginal distribution factorizes: 
P{xka, Xjb) = P{xka)P{xjb)- Under this assumption, when a function node is removed from a diluted random factor 
graph, the variables connected to the removed node become far away from each other and therefore uncorrelated. Let 
us then consider the marginal distribution P^'^^Xka) for variable x^a in the absence of node a: the value of Xka is 



^ The approximation ratio is defined as the ratio between the best possible revenue and the revenue achieved by the algorithm in the 
worst case instance. 

^ The competitive ratio is defined as the ratio between the best possible revenue and the revenue achieved by the algorithm for the worst 
case sequence of choices of the on-line instance. 



influenced only by the constraint in node k, so that 



{xkh\b&dk\a} 



Xka 



E 

bedk\a 



Xkb 



P{{xkb\bedk\a}) 



(7) 



where A/"^""* is a normaHzation factor, dk = {ajw^a > 0} and the sum is over the 21^*^1"^ possible configurations of the 
variables {xkb\b € dk \ a}. In the limit N, M —> oo, the joint marginal probability P {{xkb\b dk \ a}) appearing on 
the right hand side can be replaced with the cavity joint marginal probability P'-'"-' {{xkb\b & dk\ a}) (this corresponds 
to a self-consistency assumption on the cavity marginal distributions for systems with (iVa, A^k) function nodes and 
systems with (iVa — 1, iVk) or (iVa, A^k — 1) function nodes, as explained in detail in 0, But in the absence of node 
k, the variables Xkb are far away in the graph, and the RS assumption implies that the joint marginal distribution 
factorises: 



{xkb\b&dk\a} 



Xka + E Xkb = 1 
be8k\a 



n ptb\^kb) 

be8k\a 



With the same argument we obtain 



Ptt'M=<' E 



{xja\jeda\k} 



n p^:\-^^) 

j€da\k 



Since the variables are binary, the cavity marginal distributions have only one parameter and we shall 
them in terms of cavity fields Hka and Uka- 



Ptt\^) 



Pja (0) ^ ^-pUka 



(8) 
(9) 

represent 
(10) 



In terms of the cavity fields, the self-consistent equations for the cavity marginals become: 

'^bedk\a Pjb (-*-) Y\cedk\{a,b} Pjc (Q) 



-/3C/fc 



Y\bedk\a Pkb 

>{0) 



E 

bedk\a 

Z^a;„\fc ^ llj<^da\k ^ja \Xja) 

l^x^-^k llj£aa\k ^ja \Xja) 



(11) 



-P[Ea(0,x^-^k)-x^-^k-U^\k] 



V p-/3[Sa(l,2;a\A,)-a:„\fc-(7„\fc] 



(12) 



where Xa\k = {xja\j <E da\ k}, Ua\k = {Uja\j e da\ k} and x^^k ■ Ua\k = J2jeda\k ^jaU-ja, and where 



0, Ba - XkaWka 



j£da\k 



(13) 



We can solve the system of coupled equations ([Tl]) and (fT2l) by iteration: we start with some random values for the 
cavity fields {H^g.} compute {Ul^} using pT|) with the values of {H^o.}^ ^^^^ we compute {Hl^} using and 
the values of {Ul^^}; we iterate this procedure until the values {Ul^} and {Hl^} are a fixed point of the system pT|) 
and (112p within some prescribed numerical accuracy. 



Zero temperature limit of the probability marginals 



As we mentioned before, the optimal assignments of a given instance of the problem will correspond to the con- 
figurations with lowest energy. In order to select them, we shall take the zero temperature limit /? oo. When 



doing this, we must keep in mind that in case there are multiple optimal assignments, some of the variables will take 
a unique value in all of them, and in physical terms they will be completely "polarized" , while other variables will 
be able to take different values in different optimal assignments. In order to take this into account, we are going to 
assume that as /3 oo the cavity fields scale as 



(14) 



where hka and Uka are known as hard fields and S,ka and rjka are known as soft fields. The soft fields become crucial 
for those variables for which hka + Uka ~ (these variables are precisely those that are not fully polarized) , and they 
are irrelevant for the other variables. 

Assuming the scaling (|14p and substituting it into the equations for the cavity fields (fTT|) and (fl^ we obtain, by 
comparing the terms of 0(1) and those of 0(/3^^): 



Uka 



Vka 



max hkb 

bedk\a 



log 



E 

{b£dk\a\ hkb - 



and 



hka 
^ka 



g*{Ba)-g*{Ba-Wka) 



log 



E 



- log 



E 

Xa\k&{xl^kiBa)} 



QXa\k-na\k 



where 



g*{z) = ming{z,Xa\k) 



Xa\k 



g{z,Xa\k) 



[O, Z - Xa\k ■ Wa\k] - Xa\k ' Ua\k 



(15) 
(16) 

(17) 
(18) 

(19) 
(20) 



are functions of the hard cavity fields {uja} even though we don't write explicitly this dependency, and where the two 
sums in (fT5|) are restricted to local configurations x'^x^f.^Ba — Wka) such that g{Ba — Wka, xl^\i,{Ba — Wka)) = g*{Ba — Wka) 
in the first sum and x*x^i^{Ba) such that g(Ba, x*^^i^(Ba)) — g*{Ba) in the second sum. 

It is possible to give a simple physical interpretation of the hard and soft fields by considering the following 
quantities: 



Kai^ka) 



(x) 



{x,b\Ub)=^ika)} 



{xjt\(jb}^{ka)} 



(21) 
(22) 



where E'^'^^x) is the energy of configuration x on the cavity graph Q^°-\ Clearly, ^kni-'^ka) is the average value of 
the energy (at temperature /3) on the cavity graph C/*-"^ when the variable Xka is constrained to take a given value, 
while \og{n^i^J (xka)) is the value of the entropy under the same conditions. When j3 — > oo, e^i^J{xka) tends to the 
minimum value of the cavity energy over the configurations with given Xka, while n^i°^(xka) tends to the number of 
configurations that realize this minimum. 

The cavity bias for variable Xka can be written as 



{x.tlUbmka)} 



-/3_E(")(a;fc,=0,{x,i,}) 



-/34°o'(0)„('^) 



{x,,\Ub)=i{ka)}' 



-I3EM (xka = l,Ujb}) 



-/34°J(l)r,i") 



from which we identify 



hka = 4:^(0) -4:^(1), 
6a = iog4:'(i)-iog4:)(o) 



(23) 



(24) 
(25) 



The hard field hka is then equal to the difference in the average cavity energies of configurations with x^a ~ relative 
to configurations with Xka = 1, while the soft field ^ka is equal to (minus) the difference in the cavity entropies between 
the same configurations. The fields Uka and rjka associated to a keyword cavity (k) can be interpreted in the same 
manner. 

A few remarks are in order. First, notice that the equations p5ll7p for the hard fields {hka} and {uka} are 
independent of the soft fields {£,ka} and {rika}- One can then split the computation into two parts: first, the fixed 
point of the recursion (fTSlfTT]) is found for the hard fields, giving {h^^} and {u^^}; then, these values are substituted 
(as constants) into the recursion p61 [TBI) , which is solved. This is usually faster and simpler than doing a parallel 
update. 

The second remark concerns the absolute values and the signs of the cavity fields: it is easily seen that g*{z) is a 
continuous non-decreasing function of z, so that the fields hka are positive reals'^ or zero; this in turn implies that the 
fields Uka are negative reals or zero. In a way, this could be expected from the message-passing interpretation of the 
fields: the messages Uka "express" the constraint that keyword k be assigned to only one advertiser, and therefore tend 
to bias the variables towards (corresponding to a negative field) , while the messages hka "express" the requirement 
that the energy be as low as possible compatibly with the budget constraint, and therefore tend to bias the variables 
towards 1 (corresponding to a positive field). Similarly, it can be seen that the messages ^ka must be positive or null, 
so that the messages rjka must be negative or null (the argument however is less trivial and we shall omit it). 

As a final remark, let us show that all these update rules can be computed efficiently. This is obvious for the 
equations (|15p and (|16p , which involve finding the minimum among c — 1 numbers (where c is the connectivity of 
the keyword considered). By looking for the two largest values of hka for a given fc, one can update all the Uka and 
r]ka messages on the edges incident on k in linear time in c. For the equations (fT7|) and it is less obvious: these 
equations require finding the minimum over the local configurations Xa\k of a function g{z,Xa\k), i-e. to compute it 
in 2*^^^ points, where c can be very large as A^a,-^k ^ oo. However, this can also be computed efficiently. Let us 
consider the fimctions 5,* (z) obtained by minimizing g(z, {xi, • • • , Xn}) over the first n variables, which we denote by 
Xi for simplicity. The functions (7^(2) obey the recursion: 



gn+i{z) = , niin I max 



(26) 



n+l 

XiWi 

i=l 

= min [g^iz), gl{z - Wn+i) - Un+i] (27) 




with the initial condition 



55 (z) = max[0, z] . (28) 



It is then possible to compute the function g*{z) by iterating the recursion (|27p . This can be done either approximately, 
discretizing the values of z, or exactly, by exploiting the fact that g*{z) is a function which has slope equal to either 
one or zero in any point, and that the number of points where it changes slope remains finite as the connectivity 
increases (this actually depends on the distribution of the values of the bids {wka} and of the fields {uka}^ but it is 
true for the distributions we shall consider) . The sums appearing in the logarithms of (jlSp can be computed together 
with the function g*{z) in a similar way. 

C. Average energy and entropy in terms of the fields 

For a given instance, the average energy and the entropy at zero temperature can be computed as functions of the 
fields {hka}, {(,ka} (or equivalently {uka}, {Vka}) by writing in their definitions, 

E - ^P(x)i?(a;), (29) 

X 

S = -^P(x) logP(x) (30) 



^ Notice that this contrasts many other combinatorial optimization problems in which the energy, and therefore the hard fields, are discrete 
quantities. 



the distribution P{x) in terms of the values of the fields at the fixed point: 

Fix) ^Y[PaiXa) Y[Pkixk) n Pkaixka)-^ , (31) 
a k (ka) 

where as before Xa = {xka\k € i9a}, Xk = {a;fca|a G dk}, and 

Pa{Xa) - AC' exp {-(3 [Ea{Xa) ~ Xa ■ Ua]} , (32) 



Pk{xk) = cxp{(3xk-Hk} 1 



.6e9fc 



(33) 



) = exp{P Xka{Hka + Uka)} (34) 



and where A/'a, Afk and A/fca are normalizations. 

From these equations one obtains the average energy in the limit /3 



^ l^x^e{x'} "iax [U, I^a ~ Xa ' Wa\ e - 

^ = ^ y , ,e-^-v^ 



with the same notations as in the previous paragraph: Ua = {uka\k G da} and similarly for rja and Wa', Xa ■ rja = 
J^keda^kaVka and similarly for Xa ■ Ua and Xa ■ Wa] g{Ba,Xa) = max [0, Ba - Xa ■ Wa] - Xa ■ Ua, which is defined 
similarly to equation (j20p . except that all the variables incident on a are considered instead of all but one; g*{Ba) = 
minx^ g{Ba, Xa)', {x*} is the set of partial configurations Xa that realize g*{Ba). When all the {uka} are non-zero, 
this expression simplifies greatly: 

S = ^ max [0, Ba- x*a- Wa] (36) 

a 

where x* is any of the (possibly degenerate) partial configurations that minimize g{Ba,Xa)- Such a configuration, in 
turn, is easy to compute if all the {hka} are different from zero, as can be seen by writing explicitly the minimization 
with respect to one of the variables, say Xka'- 

ming{Ba, Xa) = min< min [max (O, Ba - Xa\k ■ Wa\k) - Xa\fe ■ "a\fc] , 

(0, Ba - Wka - Xa\k ' Wa\k) ~ Xa\k ' Ua\k - Wfca] \ (37) 



mm max 

= min [max (O, Ba - Xa\k ■ Wa\k) - Xa\k ■ Ua\k] + min(0, -hka - Uka) (38) 

where the choice of the first term in the two-term minimum corresponds to x*j^^ — while that of the second corresponds 
to x*f^^ = 1, provided hka 7^ 0. Under the combined assumption Uka 7^ 7^ hka y{ka) we have x^^ ~ d{hka + Uka), 
where d{x) = 1 for 2; > and 9{x) = otherwise. 

The expression of the entropy is somewhat more complicated: 



xae{x'j '^xae{x'j 

v-^ J ■r-^ f Y.aedk:hka.=hl ^kaC^ 

k y aedk:hka=K ^aedk:hka=hl 

~ E 1 1'^- + = 0] (log (1 + -"'^^''^) - I (39) 

ka ^ ■' 

with the same notations as above and h^. = maxagafc hka- 

Once the fixed-point fields {hka,S.ka} (or equivalently {ukajVka}) are known, it is straightforward to compute the 
other set of fields via (fT5l [T6|) or via (fTTl [T8|) , and then to compute the energy and entropy of the ground states via 



III. AN EFFICIENT ALGORITHM FOR RANDOM INSTANCES OF THE PROBLEM 



A. Characterization of the random ensemble of instances 

An instance of the problem can be represented as a bipartite graph X ~ {K,A;E) in which K = {!,• • • ,iVk} is 
the set of keywords, A = {!,••• , iVa} is the set of advertisers, and E = {1, • • • ,Ne} is the set of the edges (ka) 
such that Wka > (notice that this graph should not be confused with the factor graph introduced earlier). We 
assume the graph to be random with given iVa, A^k and TVo, and that in the thermodynamic limit iVa, A^k, oo the 

average connectivities Ca = N^/Ni^ and Ck = Nc/N]^ both remain finite. We consider the (non-zero) bids Wka to be iid 
random variables with uniform distribution in ]0, 1]. Given the bids {wai, ■ ■ ■ yWac} offered by an advertiser a with 
connectivity c, any budget smaller than B™™ = min^ Waj or larger than B™^^ = Waj will be meaningless. We 
shall therefore consider a distribution of budgets correlated to the connectivities and to the bids, in such a way that 
each budget Ba is constrained to the interval [S™™, i?™^'']. The most "natural" distribution of budgets is obtained 
by conditioning the uniform distribution in B^a'^^] have a given sum, specified by a parameter h G [0, 1]: 

P [{Ba}\ b, {wka}] (X S ( ^ [Bf^ + b (B-- - Bf^)] n ^(^- " Ban OiBr"" ~ ^a) • (40) 

\ a a / a 

As iVa — > oo the "reduced" budgets ba = (Ba — -B™'")/(i?™^'^ — _B™™) become iid random variables with an exponential 
distribution in the interval [0, 1] with parameter 7 given by the (unique) solution of 1 + l/(e''' — 1) — I/7 = &. This 
makes it very simple to generate large instances from this distribution. In the thermodynamic limit the average 
budget B is related to the parameter b by the linear relation: 

B='—(e-^-l^c,){l~b) + ^b. (41) 

Ca ^ 

We shall denote this random ensemble as (a{Ng„N]i,Nc,B) for finite-size instances and as ^oo(ca,Ck, 6) in the 
thermodynamic limit. In the following sections, we shall discuss how the properties of and vary with these 
parameters. Of course, one can consider many other ensembles of instances. For example, one could consider an 
exponential distribution for the bids, or remove the correlation between bids and budgets. In fact, only the distribution 
of the actual bids and budgets can claim to be more interesting than the other possible ones, and the only motivation 
for our choice is that it is reasonable and simple. 

B. Definition of the algorithm 

We shall now define an algorithm, based on the analysis of the previous section, which finds efRciently optimal 
(or almost-optimal) configurations for random instances of the problem extracted from the distribution described in 
the previous paragraph, and which coincides with the zero-temperature Belief Propagation (BP) algorithm, with the 
cavity fields playing the role of messages, and the self-consistent equations for the cavity fields playing the role of 
update rules. We shall see that in order to find an optimal configuration of the problem it is sufficient to compute 
the hard fields, so the update equations for the messages will be p5p and p7p . which we rewrite as h^a ~ '^ka{ua\k) 
and Uka = ^ka{hk\a) ■ Foi' the sake of concreteness the pseudo-code of the Belief Propagation algorithm is written 
in Algorithm IIII.ll The inputs arc an off-line AdWords instance X, an initial set of messages {h^^}, a convergence 
criterion '^^{{h^o}, e) G {true, false}, and a maximum number of iterations T . 

This algorithm is successful in finding an optimal configuration only if the following two conditions are met: the 
messages must converge to a fixed point, and at the fixed point h^a + Wfca must be non-zero for all (fca). As we 
shall see in the following section, these conditions are not always verified for random instances extracted from the 
random ensemble S'{Nii, Ny^, Nc, B) described in paragraph IIII Al However, we shall see that it is possible to modify 
the update equations and PT|) so that a solution is (almost) always found. 

IV. NUMERICAL RESULTS 
A. Convergence of the zero-temperature BP 



We have implemented the Algorithm IIII.ll and used it to solve random instances from the distribution S' described 
in paragraph IIII Al with several values for the distribution parameters A'a, A'k, and b. 



Algorithm III.l: Zero-temperature BP(I,{h'-l,,},'^,T) 



t ^ 

while (^^({/iL},e)) and {t < T) 
'for each (ka) £ E 

do '^Uka{hl^a) 

do {t^t + l 

for each {ka) £ E 



do hi. 



for each (ka) £ E 

do ^-WfcaClfcv) 

if (^({ftL},e)) and (/iL +«L / V(fca)) 

(for each ka E E 
do xka <- e{hi^ + 4 J 
return (x) 
else return (Undetermined) 



In Figures [T] and [3] we study the convergence behavior of the algorithm as the budget distribution parameter b is 
varied, while the other parameters are kept fixed: A^a ~ 1 000, A'k = 3 000, A^o = 10 000. The algorithm is run with 
the following parameters: the initial values of the messages are set to h^^ = for all (ka), the maximum number of 
iteration is T = 2 000, the convergence criterion is 



{(ka) 



with e = 10^^. A total of 31 736 instances are generated with values of b in the range [0.17, 0.45], and they are grouped 
according to the actual value of the average budget B, in bins of amplitude 0.02 (recall that since the A^a individual 
budgets are extracted as iid random variables from a distribution with parameter b, their average B is also a random 
variable). 

For a given bin, if the total number of samples is n and the number of samples for which convergence is obtained 
is m, the distribution for the probability (over the choice of the instance) of convergence p is computed as follows'': 

TP\m\n,p] = ( |p"(l ~ P)""" =^ Pb|n, m] = (n + l)('^]p"'(l - p)""™ (43) 
\m J \m J 

so that the expected value of the convergence probability p and its standard deviation Sp are: 

- rn + 1 

P = — ^, (44) 




= \hrT^^-zrT^- (45) 



In Figure [U we show the results for p and Sp as a function of B. For small values, B < 1.4, the algorithm almost 
always converges, while for 1.4 ^ B < 1.5 there is a sharp decrease in the probability of convergence. For larger 
values, 1.5 < B, the probability of convergence gradually increases back to 1, reaching the value p ~ 0.5 for B ~ 1.8. 

The equivalence of the convergence of our algorithm with the RS assumption together with the above results 
suggest a phase diagram (in terms of B) with a RS phase for i? < Bq, a Replica Symmetry Broken (RSB) phase for 
Bq < B < Bi and again a RS phase for Bi < B, where i?o and Bi are functions of the average connectivities Ca and 
Ck and are roughly equal, for Ca = 10 and Ck = 10/3, to 1.4 and 1.8 respectively. The existence of a phase transition 
at B ~ 1.4 is supported by the scaling of p with the size of the instance (i.e. A^a) shown in Figure O which indicates 
that slightly above B = 1.4 in the infinite size limit p tends to zero. The transition at B = 1.8 on the other hand 



* Assuming a uniform prior over p. 



Average budget distribution parameter b 



0.2 0.24 0.28 0.32 0.36 0.4 0.44 




Average budget B 

FIG. 1: Convergence of the algorithm for different values of the average budget B. The other parameters of the instance 
distribution are as follows: A^a = 1000, A'k = 3 000 and A'o = 10 000 (the parameters for the algorithm are specified in the 
text). The samples are grouped according to their average budget B in bins of amplitude 0.02 (the horizontal errorbars are not 
shown for simplicity); the corresponding values of the distribution parameter b, computed according to (|4H) . are shown on the 
upper scale. The number of samples in each bin is approximately 500. 



does not appear to be sharp. This is not in contradiction with the suggested phase diagram, since the convergence 
probabihty on finite size instances might be affected by the presence of local structures where the fields do not reach 
convergence, but which do not contribute to the properties of the ensemble in the infinite size limit, as happens for 
example in the random (2 +p)-XORSAT. Indeed, wc shall see that the analysis of the stability of the RS solution (in 
paragraph IIV B|) confirms this phase diagram. 

It should be stressed that these results on convergence could, in principle, depend heavily on the choices of the 
convergence criterion ''^ {{hl.^} , e) , of the maximum number of iterations allowed T, and of the initial values of the 
messages h^^. We find that the choice of the convergence criterion has very limited impact, as long as it is a reasonable 
one. We experimented with several values of e between 10^^*^ and 10^^, and we also tried other possible criteria, such 
as the following two: 

'^'i{hL},e) = |max[max(|/^L-d|4a-4;'|)] < A j^C/^L + 4a) = ^(C + "L'): V(fca)| ,(46) 

"^"({hLh e) ^ !^9{hl^ + 4 J ^ eihlJ + 4;i) = . . . = 0{h[-^ + ui-% V(fca)| . (47) 

In all these cases, we find very similar results. We also tried an alternative for the initial condition: setting h^^ 
to random values extracted uniformly in the interval [0, 1[. Again, this has very limited impact on the convergence 
behavior of the algorithm. Finally, wc analyzed the possible effect of the cut-off on the number of iterations allowed. 
In Figure [3] we show the average number of iterations Ui required for convergence, together with the fraction of runs 
that converge and require less than 1000 iterations (i.e. half the maximum number of iterations allowed, T). By 
comparing the two plots, and assuming that the distribution of Ui is unimodal when T —^ oo, it seems clear that the 
effect of this cut-off is negligible for the samples considered. 

In Figure m we sec that for S < 1.5 a large fraction of the variables arc unfrozen, that is to say that the fields 
associated to them verify hka + u^a = 0, so that the simple prescription used by Algorithm IIII.ll is unable to assign 
them. The existence of regions of values of B where the messages often do not converge to a fixed point, or where 
the fixed point corresponds to a large number of unfrozen variables, constitutes a serious problem for the practical 
use of our algorithm. However, as we mentioned before, it is possible to "fix" it with two simple modifications of the 
update equations and PT)) . 
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FIG. 2: Scaling of the convergence probability p with the number of advertisers A^a for fixed Nj^/N;^ = 3, A^c/A^a ~ 10 and 
B = 1.44. The data points are very well fitted by p oc A'^^ with the exponent v — —1/2, indicating a sharp transition in the 
infinite size limit. 
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FIG. 3: Number of iterations required for convergence (same samples as in Figure [T]). The bottom curve (left scale) shows the 
average number of iterations required for convergence and its standard error. The upper curve (right scale) shows the fraction 
of the samples that converge requiring less than 1 000 iterations. 



The probability of convergence pof the algorithm is greatly improved if the update rule for the fields /i^^^ is modified 
to include a reinforcement term jlll |. The modified rule takes the form: 

hia^nU<-{l) + p{hi-' + <-') (48) 

where p is a constant real number which we shall refer to as reinforcement parameter. In practice, the value taken by 
the new field is a linear combination of the value prescribed by the update equation (jl7p and the value of the full field 
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FIG. 5: Effect of reinforcement on the convergence of the algorithm. The parameters of the instances distribution are the same 
as in Figure [T] but a reinforcement term with parameter p is included in the update equation (|48p . 

(as opposed to the cavity one) acting on the variable being updated, which is equal to h'j^^ + u^j^^. The coefficient p 
is used to "tune" the strength of this term. 

The results of this modification in the update rule are quite spectacular, as can be seen on Figure [5l where the 
probability of convergence p is compared, for the same samples, with different values of the reinforcement parameter 
p. The parameters of the distribution of instances are the same as in Figures [T] - HI i.e. N^, = 1000, iVk = 3 000, 
Nc ~ 10 000, and the total number of instances is 6 771. The parameters for the algorithm are as follows: the 
convergence criterion used is , defined in (|47p . the maximum number of iterations allowed is T = 800, and the 
initial values of the messages are = for all (ka). For p = 0.3, approximately 99% of the samples converge within 



Average budget distribution parameter b 
0.2 0.24 0.28 0.32 0.36 




1 1.2 1.4 1.6 1.8 2 



Average budget B 

FIG. 6: Effect of random pinning fields on the numbers of frozen variables. The instances are the same as in Figure [5] 



500 iteration. 

The second problem of the original algorithm, namely that a large fraction of the variables are unfrozen for small 
values of B, can also be easily solved, by adding a random pinning field to the update equations, which become 
(including the reinforcement term): 

^ Hka{ul\l) + p{hl-' + Ul-' - 6ka) + Ska , (49) 
^ Uka{hl\J+S,a (50) 

where the pinning fields Ska are constant random real numbers, extracted from a uniform distribution in the interval 
]0, (5niax]- Notice that the pinning field also enters in the reinforcement term: since it affects both the updates of h^a 
and Uka , its effect is duplicated in the sum hka + Uka , which must therefore be corrected as hka + Uka — Ska (this also 
applies to the step functions defining the values of Xka and the convergence criteria). Figure [S] shows the effect of 
the pinning field on the average number of frozen variables: all the variables become frozen, except at most 6 out of 
10 000 in the samples considered. 

With these two modifications (i.e. the introduction of a reinforcement term and of random pinning fields), our 
algorithm is almost always capable to find a low-energy configuration of the problem. However, the introduction of 
these "spurious" terms in the update equations could lead to an increase in the energy of the configuration obtained, 
relative to the true optimal energy of the instance. In order to verify that this effect is not significant, we have compared 
the energy of the configuration found by the algorithm using both reinforcement and pinning fields, with the average 
energy computed according to equation ((55)) in the absence of them, for those instances for which convergence is 
obtained. Notice that this requires to compute the soft fields rjka and ^ka in addition to the hard ones, which we have 
done for the sake of the comparison. 

In Figure [7] we show the results of this comparison: the addition of reinforcement and pinning fields causes a 
systematic but very small increase in the average energy per advertiser. In order to quantify the relevance of this 
effect, its magnitude should be compared to the average revenue per advertiser rather than to the value of the energy. 
We find that the relative effect on the revenue is always smaller than 3 x 10^^, with an average value and a standard 
deviation of 3 x lO-"* ± 4 x lO-"*. 



B. Stability of the replica symmetric solution and phase diagram of the problem 



As we have seen in the discussion of paragraph IIV Al there is an interval of values of the parameter of the budget 
distribution B for which the algorithm (in the absence of reinforcement) does not converge, and this suggests the 
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FIG. 7: Comparison of the average energy per advertiser obtained with reinforcement and pinning fields to that obtained 
without them. The instances are the subset of those analyzed in Figure [5] for which the algorithm without reinforcement and 
pinning field converges; their total number is 4 914. The points corresponding to p = 0.3, Jmax = 10^^ have been slightly shifted 
horizontaUy to make the comparison possible. 



presence of a replica symmetry broken phase. In this paragraph, we are going to present a method which permits 
to verify this hypothesis directly, and to present some numerical results confirming it. This method was originally 
introduced in [I4I to verify the stability of 1-step RSBflRSB) solutions, but it is easily simplified to verify the stability 
of RS solutions as well, as was done for instance in [ij, [3l • 

The idea behind the method is very simple. The RS solution is a special case of a more general solution, the 
IRSB solution. The difference between the two is the following: while in the RS solution the value of the cavity 
fields hka at the fixed point are well defined, in the IRSB solution they are defined only in distribution. The correct 
parametrization of the IRSB solution is not given by a set of numbers {hka}, but by a set of functions {Vkaih-ka)} 
which describe the probability distribution of the values of hka- The RS solution is a special case of this, in which all 
the functions Vkai') are delta functions. 

A necessary condition for the RS solution to be correct is that it is stable under the enlargement of the solution 
space to the IRSB case. This can be verified by considering what happens under the iteration (fT51 [T7)) if instead of a 
sharply defined value the cavity fields hka have some small but finite variance Vka/^ around their fixed point values 
ht , i.e. if they have a distribution 



(51) 



where Afka is a normalization. Under the iteration (|15|17|) . the field hjb is computed as a function Jifjh of the fields 
hka that are at distance 2 from hji, on the graph representing the instance, i.e. 



J^Aihka}) = -Hjb {{Ukb{hk\b)\k e db\j}) 
If {hka} SLTC close to the fixed point values {h^^}, wc can expand 

h,.-^^A{Ka})+ E E 



keab\j a<£dk\b 



dhka 



{hka -hl^) + ' 



(52) 



(53) 



where ({/ifc^}) = h*^^. The probability distribution of the values of hjb is then given by 



Vjb{hjb)^T 



h*b 



E 



E 

kedb\j aedk\b 



dhka 



{hka -hl^) ~ hjb 



(54) 



which can be computed by writing the integral representation of the dcha function representing this probability and 
performing twice a gaussian integration. This gives 



Vjbihjb) = A^-bC-^''^'-''*^)'/"^'' (55) 

where we identify 

,^ _ V V f^3ldi!!kR\\, r5fi^ 

2. — dh;:^ — j ^'^^ (56) 

In order to verify the stability of the RS solution towards 1-step replica symmetry breaking, one can then associate 
a real number Vka to each cavity field hka, representing the variance of its distribution, and update the values of Vka 
with equation ([55)1 each time that the field h^a is updated. As the cavity fields reach their fixed point values {hl^}, 
one can perform one more iteration for all the edges {(fca)} in the graph and compare the values of {vka} before the 
iteration with those {w^.^} after it, and measure the stability parameter A: 

V v' 

A = = . (57) 

If A > 1 the RS solution will be unstable towards a IRSB enlargement of the solution space, otherwise it will be 
stable. 

Notice that, in general, the derivative dJ^fjb/dhka will involve also the soft field {£,ka}, which therefore must also 
be computed. This can be seen by writing explicitly 

f dJ^fjb\^ f dHjb\^ f dUkb\^ 

(t^) = E 1:^) E 1;^) (58) 

2 

> X 

(59) 



^ keab\j ^ ^ aedk\b ^ 



E 



aGdk\b 



max hkc 

cedk\b 



where the soft fields j]b\j are computed as functions of S,k\b from (jl6p . and where a;^^^ (z) is the set of partial configu- 
rations that minimize the function g(z,Xb\j) defined in (j20p . 

This procedure can be applied to any given instance. Alternatively, it can be used to verify if the RS solution is 
stable in the thermodynamic limit for the ensemble (o'oo(ca, Ck, 6), using the population dynamics technique: in this 
case, the topology of the problem is not fixed, but is extracted at random for each update. One then considers 
a population (i.e. a set) of N triples {hi,£,i,Vi). At each step, a local tree of depth 2 is extracted, with average 
connectivity Ca for the root and Ck for the nodes at depth 1. Each leaf of the tree is associated at random with a triple 
{hi,^i,Vi) from the population. The value of the fields and at the root are computed according to (fTsHIS]) . and 
the value of vq is computed according to (j56p . Then, a triple {hi,(,i^Vi) is selected at random from the population 
and is replaced with {ho,^o,VQ). This procedure is repeated a number of times Tq = tgN ^ N (the population 
size) in order to reach the equilibrium distribution of fields V{h). The population of variances is then reinitialized to 
Vka = 1 V(fca), and another Ti = tiN ^ N iterations are performed (renormalizing the variances after each AT = N 
iterations) so that the distribution of variances also equilibrates. Finally, T2 = 3> N iterations are performed, 
computing Xt according to (|57p . and averaging the results. Another quantity which is of interest to characterize the 
stability of the RS solution is the fraction of non-zero variances Vka hi the population, which we shall denote by 0. It 
can be computed together with A, and it must be zero for the RS solution to be stable. 

Using this approach, we have obtained the results shown in Figurc[Sl The parameters of the distribution of instances 
are the same as for the data shown in Figures [T] average connectivity Ca ~ 10 for the advertisers and Ck ~ 10/3 for 
the keywords; the population size is 10^; the equilibration time is = 1500 for the distribution of fields and ti = 1000 
for the distribution of variances; the measurement time is ^2 = 4000. We see that the region b G [0.268,0.337] 
(corresponding to S £ [1-41, 1.75]) is unstable, i.e. has A > 1. This is in very good agreement with the convergence 
results of the algorithm, which show a marked drop in the fraction of instances that converge in the same region. 
Figure [HI shows the corresponding average energy per advertiser and the average entropy per variable. The nature of 
the two transitions appears different, and it may be interesting to study with more depth how "sharp" is the one at 
B = 1.75. 
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FIG. 8: Stability parameter A and fraction of non-zero variances for different values of the budget distribution parameter b, 
obtained from the population dynamics technique. The average connectivities used are the same as in the previous plots, i.e. 
Ca = 10 and Ck = 10/3. The region 0.268 <b< 0.337 (corresponding to 1.41 < B < 1.75) is unstable, i.e. has A > 1. Outside 
of this region, both A and (p vanish. The values of A in the unstable region are well fitted by a line (except the first point), 
while the values of <f> are well fitted by a parabola. 



The previous analysis confirms the phase diagram suggested in paragraph llV Al for b <bo we have a RS phase with 
negligible average energy and positive entropy, for bo < b < bi wc have a RSB phase, and for bi < b we have again 
a RS phase, but with positive energy and negligible entropy. The threshold values bo and bi depend on the average 
connectivities Ca and Ck. 



C. Comparison to other algorithms 



We have compared zero-temperature BP to several other algorithms in terms of the average energy of the solutions 
they find. The alternative algorithms considered are Simulated Annealing (SA) and Linear Programming (LP) 
relaxation, with several rounding procedures. In the following paragraphs we shall briefly describe them, and then 
show and discuss the numerical results we obtained. 



1. Simulated annealing 

Simulated annealing is an optimization technique first proposed by Kirkpatrick et al. which is widely applied 
in the field of combinatorial optimization and which is inspired by the annealing of solids. It consists in a Metropolis 
procedure in which the temperature is gradually decreased. In the case of the off-line AdWords problem it is described 
in Algorithm IIV.II The inputs are an instance of the problem X, an increasing sequence of inverse temperatures 
{Pi, - ■ ■ , /?„} with /3i <C 1, /3n ^ 1 and 1, and a large positive integer T ^ 1. 

The values of T and of {Pi} must be such that the system thermalizes at any temperature Pi. In the process we 
can keep track of the configuration with the smallest energy explored in the sampling process, which would be the 
best approximation for the optimization problem we want to solve. Additionally, keeping track of the average energy 
for different values of temperature, it is possible to obtain the entropy of the system as S" = J dE p{E). 
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FIG. 9: Average energy per advertiser and average entropy per variable computed with the population dynamics method. The 
average connectivities used are the same as in the previous plots, i.e. Ca = 10 and Ck ~ 10/3. The shaded region b £ [0.268, 0.337] 
corresponds to the RSB phase: in this region, the values obtained for {Ea) and (Ska) in the RS approximation are (probably) 
not correct. 



Algorithm IV. 1: Simulated Annealing(X, T) 



for each k G K 

{select uniformly at random a G dk 
for each b £ dk 
do Xlt, ^ 5a,b 
3^miii ^ ) -E'min -S' (^^^min) 

t 0, i'r-1 

while [i < n) 

'while (t < iT) 

select uniformly at random k £ K 

select uniformly at random b £ dk\a with a : — 1 

^ka *~ 0, Xj^f, <— 1 

AS ^ E{x') - E(x') 
P ^ min[l, exp{-f3iAE)] 
select uniformly at random p £ [0, 1] 
if (p < P) 



do < 



do < 



then { if iE{x'+^) < En,in 



then Xn 



En 



E{x min I 



else 3-*+"'" 



return (xmin 



2. Linear programming relaxation with Random Rounding (RR) 



Linear programming (LP) is a common technique for optimization of linear functions over continuous variables. 
The variables are restricted to a given domain, usually called the feasible set, which is specified by a set of constraints 
that must be satisfied simultaneously by all the variables . Even though LP is defined on continuous variables, it 
is possible to exploit it in the design of approximation algorithms for problems in combinatorial optimization. The 
heuristics works as follows: first relax all the discrete variables in the original combinatorial problem by allowing them 
to take continuous values. Then use LP to find an optimal set of values for the continuous variables. In general, the 
optimal value for the cost function of the relaxed problem yields a lower (upper) bound for the original minimization 
(maximization) combinatorial problem, since the search space is enlarged by the relaxation. Moreover, if it turns out 
that the solution of the relaxed problem corresponds to all the variables taking integer values, then this is a solution 
for the discrete problem too. Otherwise, a strategy for rounding the continuous variables to integers must be specified. 
In general, the solution for the relaxation yields a mixture of discrete and fractional variables; it is important to bear 
in mind that the discrete values obtained in the solution of the LP relaxation do not necesarily correspond to an 
optimal solution of the original problem. Hence, in general, the rounding scheme should take into account not only 
the variables with fractional values, but the whole set of them instead. Figure [TOl describes an example of integer 
variables that can be assigned by LP to the relaxation of the AdWords problem which are incompatible with any 
optimal allocation. 

By introducing ancillary discount variables {da} representing the discount da given by the auctioneer to advertiser 
a in case the actual assignment exceeds the budget Ba, the AdWords problem can be restated as follows 



max^ ^kaWka - dt^ 



(60) 



subject to 

J2xka<l, (61) 

a 

XkaWka - da < Ba , (62) 

k 

Xka e {0, 1} , (63) 

which is now a mixed integer programming problem, as the discount variables are allowed to take any continuous 
value. With this formulation, the LP relaxation consists in replacing the integer constraints in (|63p with 



< xla, (64) 

where the superscript r stands for relaxation. This inequality, along with (|6ip . guarantees that < 1. Moreover, 
the discount variables da arc irrelevant in the relaxed version because the budget constraints (j62p can be met by 
tuning the continuous variables x^,^ themselves; in fact they all take value zero in the solution attained by LP. They 
will play an important role, though, in the rounding process as described below. 

A simple way to round the fractional solutions given by the LP relaxation is the following Random Roundin 
(RR): assign each keyword k at random to advertiser a with probability proportional to x^j,^. Theorems 4 and 5 of 
prove that the approximation ratio of RR is e/(e — 1) ~ 1.582. This means that if Rq is the optimal revenue for the 
problem considered, the solution given by RR will have a revenue R satisfying the inequality 

R< Ro< — ^i? . (65) 



3. Linear programming with Garg- Kumar- Pandit (GKP) rounding 

Garg, Kumar and Pandit [13] proposed another rounding algorithm for the solution of the LP relaxation of the 
problem. The GKP algorithm takes the integer values obtained by LP as part of the solution, and develops a strategy 
to round up the remaining variables with fractional values. Theorem 3 in proves that the algorithm to be described 
has an approximation ratio of (1 + \/5)/2 ~ 1.62, i.e. 



(66) 



(a) (b) (c) 

FIG. 10: An example of integer variables that can be assigned by LP to the relaxed problem which are not compatible with 
any optimal solution: (a) Two advertisers bidding for three keywords with budget Ba = 1 each; advertiser ai bids 3/4, 1/2 and 
1 for keywords ki, k2 and fcs, respectively, while advertiser 02 makes the same bids but in reverse order, (b) An integer optimal 
allocation with the budgets of both advertisers saturated; the missing links correspond to variables assigned to zero, except 
for keyword fc2 which can be assigned to whichever advertiser with no change in the total revenue, (c) A fractional optimal 
allocation, where the thick blue links correspond to the variables assigned to one, the missing links to those assigned to zero, 
and the dashed links to the fractional variables, which are equal to 1/2 in this case. Notice that keyword k2 is needed in order 
to saturate the budgets of both advertisers. 

with the same notation as in ([65]) . Notice that even though this bound is less favorable than ((65|) . the average case 
performance of GKP on some distribution of instances could actually be better than RR. 

The rounding part of the algorithm is best described by introducing the residual graphs which is the graph obtained 
by removing from the original one all the edges whose associated variables were assigned an integer value by LP. 
The algorithm applies a series of transformations on the residual graph, consisting of a suitable redistribution of 
the mass on the fractional variables. In this way, without decreasing the degree of optimality reached nor violating 
any constraint, the residual graph is left with certain special properties; among other things, it docs not contain 
cycles and, most importantly, it is guaranteed that any maximal path^ in it starts with an advertiser whose budget is 
saturated. If the (fractional) amount x'^^Wka spent by such an advertiser a on the only keyword k linked to him in the 
residual graph is less than a given fraction p of the budget, i.e. x'^^Wka < pBa, the corresponding variable is rounded 
to zero; otherwise the corresponding discount variable da is increased, opening up the possibility for the advertiser 
to acquire such a keyword during the remaining process [l7| . Even though the rigorous bound (|66p is proved for 
p = (3 — \/5)/2 ~ 0.38, for the ensemble studied here better results were observed by using smaller values of p. 

4- Linear programming with BP-based rounding 

Another alternative to round up the fractional variables is to use the BP algorithm to calculate the minimum 
energy, under the constraint that the integer variables of the LP relaxation are forced to take the actual integer values 
given by LP. 

5. Numerical results 

In order to compare these algorithms to zero-temperature BP, we have generated 300 instances from the distribution 
^(iVa,iVk,iVo,6) with Na. = 1000, N^, = 3 000 and ~ 10 000 and with different values of the budget distribution 
parameter b and we have solved each of them with all the algorithms cosidcred. We have grouped the instances in 
bins according to the value of the average budget B, and we have computed the average energy of the solutions found 
by each algorithm in each bin. The results of the comparison are shown in Figure 1111 

The performance of BP and SA relative to the algorithms based on LP rounding appears clearly superior. In 
particular, notice that the performance of BP-based rounding is much worse than that of BP (especially for low 
budgets) : the solution of the LP-rclaxation typically contains some integer variables that have the "wrong" value for 
the integer problem. An example of this can be seen in Figure [TOl This suggests that any LP-rounding algorithm 
which assumes that the integer variables in the solution of the relaxation are correct will find a sub-optimal solution 



A maximal path is a path that cannot be extended because there are no more edges in the graph contiguous to it. 
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FIG. 11: Comparison of the average energy per advertiser {Ea) obtained by the zero-temperature BP algorithm with those 
obtained by alternative optimization techniques, namely: LP with random rounding (LP-RR), LP with Garg-Kumar-Pandit 
(LP-GKP) rounding, LP with BP-guided rounding (LP-BP), and simulated annealing (SA). The solution of the relaxed linear 
program provides a rigorous lower bound for the energy. The curves for SA and BP almost coincide in all the range. The same 
set of instances with A'a ~ 1000 advertisers, A'k = 3 000 keywords, and average keyword connectivity Ck = 10/3 (which yields 
on average iVe = 10 000 variables) has been used for all the algorithms. 



to the integer problem. This situation is very different compared to what happens for random b-matchings, a similar 
problem where LP and BP both give exact solutions p^ . 

The differences in the average energies obtained by BP and SA are always very close to zero, and since the two 
algorithms are completely different, this suggests that they are also very close to the true optimum. This conjecture 
is also supported by the analysis of the results for the entropy (not shown): as the equilibration time is increased, the 
entropy obtained by SA gets closer and closer to the value obtained by BP, suggesting that the latter is in fact exact, 
which in turn implies that the energy is exact too. Finally, the results of population dynamics for B ^ [1.41, 1.75], 
which gives the average value (over the distribution S'^o) of the actual optimal energy per advertiser, are also in 
excellent agreement with the energy obtained with BP and SA, confirming that they are exact at least in this range 
of values of B. 

It is worth noting that even though the performances of BP and SA are very similar in terms of the energy of the 
configurations they obtain, the two algorithms are very different in terms of running time. The running time that SA 
requires to find a solution with energy Eg a ^ (1 + £)Ebp is up to 100 times larger than the running time of BP for 
TV = 1000 and e = 0.01. 



V. CONCLUSIONS AND PERSPECTIVES 



We have described a simple and efficient algorithm which is capable of finding (nearly) optimal assignments for 
random instances of auctions with budget constraints, based on a statistical mechanics analysis of the problem, 
and which is equivalent to the zero-temperature Belief Propagation procedure. We have characterized its average 
performance on a natural distribution of random instances and compared it to other previously studied algorithms, 
obtaining that it is superior to them in terms of the approximation to the true optimum and/or in terms of running 
time. We have also studied the properties of the instances distribution in the infinite size limit, deriving a phase 
diagram (as a function of the parameter b of the distribution) with a replica symmetry breaking phase separating 
two replica symmetric phases, and obtaining a reference for the average value of the true optimum (in the RS phase) 
against which to test our algorithm. 

In the future it would be extremely interesting to test our algorithm on real data (for example of actual bids for 
internet search keyword auctions), or on a distribution of random instances resembling the actual one more closely. 



It would also be interesting to study the generalization of this off-line algorithm to the on-line problem, again from 
the point of view of the average performance on some distribution of search sequences (possibly time-dependent). 
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